The Role of Rare Terms in Enhancing the Performance of Polynomial Networks Based Text Categorization

Author

  • Mayy M. Al-Tahrawi
Abstract

This paper investigates the role of rare (infrequent) terms in improving the accuracy of English text categorization using Polynomial Networks (PNs). Different term reduction criteria and different term weighting schemes were tested on the Reuters corpus using PNs. Each term weighting scheme was tested on each reduced term set twice: once keeping the rare terms and once removing them. All the experiments conducted in this research show that keeping rare terms substantially improves the performance of PNs in text categorization, regardless of the term reduction method, the number of terms used in classification, or the term weighting scheme adopted.
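
The keep-versus-remove comparison described in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration: the `min_df` document-frequency threshold used to define "rare", the TF-IDF weighting formula, the helper names, and the toy documents. The abstract does not state which reduction criteria or weighting schemes the paper used, and the polynomial network classifier itself is not shown.

```python
import math
from collections import Counter


def document_frequency(docs):
    """Count, for each term, the number of documents that contain it."""
    df = Counter()
    for tokens in docs:
        df.update(set(tokens))
    return df


def build_vocabulary(docs, min_df=1):
    """Return the sorted terms that appear in at least min_df documents.

    min_df=1 keeps rare terms; a larger value removes them.
    (min_df is a hypothetical rarity criterion, not the paper's.)
    """
    df = document_frequency(docs)
    return sorted(term for term, count in df.items() if count >= min_df)


def tfidf_vectors(docs, vocab):
    """Weight each document over vocab with tf * log(N / df).

    This is one common TF-IDF variant, assumed here for illustration.
    """
    df = document_frequency(docs)
    n_docs = len(docs)
    vectors = []
    for tokens in docs:
        tf = Counter(tokens)
        vectors.append([tf[t] * math.log(n_docs / df[t]) for t in vocab])
    return vectors


# Toy tokenized documents (invented, not taken from the Reuters corpus).
docs = [
    "wheat prices rise".split(),
    "wheat exports fall".split(),
    "oil prices rise".split(),
]

vocab_with_rare = build_vocabulary(docs, min_df=1)     # rare terms kept
vocab_without_rare = build_vocabulary(docs, min_df=2)  # rare terms removed

features_with_rare = tfidf_vectors(docs, vocab_with_rare)
features_without_rare = tfidf_vectors(docs, vocab_without_rare)
```

Under these assumptions, each feature set would be passed to the same classifier and the resulting accuracies compared, which mirrors the paired design the abstract describes.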


Similar resources

An efficient symmetric polynomial-based key establishment protocol for wireless sensor networks

An essential requirement for providing secure services in wireless sensor networks is the ability to establish pairwise keys among sensors. Due to resource constraints on the sensors, the key establishment scheme should not create significant overhead. To date, several key establishment schemes have been proposed. Some of these have appropriate connectivity and resistance against key exposure, ...


Improving the Operation of Text Categorization Systems with Selecting Proper Features Based on PSO-LA

With the explosive growth in the amount of information, tools and methods are needed to search, filter and manage resources. One of the major problems in text classification relates to high-dimensional feature spaces. Therefore, the main goal of text classification is to reduce the dimensionality of the feature space. There are many feature selection methods. However...


A High-Performance Model based on Ensembles for Twitter Sentiment Classification

Background and Objectives: Twitter Sentiment Classification is one of the most popular fields in information retrieval and text mining. Millions of people around the world intensively use social networks like Twitter. It allows users to publish tweets about what they think of various topics. There are numerous web sites built on the Internet presenting Twitter. The user can enter a sentiment ta...


New Crisis Management Technologies in the Red Crescent Society of the Islamic Republic of Iran

INTRODUCTION: The Islamic Republic of Iran Red Crescent Society, which is one of the elements of the country’s crisis management organization, is present at the scene from the earliest moments of critical situations such as floods, earthquakes and fires, where it deals with the crisis and helps stabilize the situation. Given the increasing role of technology in all aspects of human life, th...


Omega and PIv Polynomial in Dyck Graph-like Z(8)-Unit Networks

Design of crystal-like lattices can be achieved by using some net operations. Hypothetical networks, thus obtained, can be characterized in their topology by various counting polynomials and topological indices derived from them. The networks herein presented are related to the Dyck graph and described in terms of Omega polynomial and PIv polynomials.




Publication date: 2013